---
title: 'SPEC-17: Semantic Search with ChromaDB'
type: spec
permalink: specs/spec-17-semantic-search-chromadb
tags:
- search
- chromadb
- semantic-search
- vector-database
- postgres-migration
---

# SPEC-17: Semantic Search with ChromaDB

## Why ChromaDB for Knowledge Management

Your users aren't just searching for keywords - they're trying to:

- "Find notes related to this concept"
- "Show me similar ideas"
- "What else did I write about this topic?"

Example:

```
# User searches: "AI ethics"

# FTS5/MeiliSearch finds:
- "AI ethics guidelines" ✅
- "ethical AI development" ✅
- "artificial intelligence" ❌ No keyword match

# ChromaDB finds:
- "AI ethics guidelines" ✅
- "ethical AI development" ✅
- "artificial intelligence" ✅ Semantic match!
- "bias in ML models" ✅ Related concept
- "responsible technology" ✅ Similar theme
- "neural network fairness" ✅ Connected idea
```

## ChromaDB vs MeiliSearch vs Typesense

| Feature          | ChromaDB                                 | MeiliSearch        | Typesense          |
|------------------|------------------------------------------|--------------------|--------------------|
| Semantic Search  | ✅ Excellent                             | ❌ No              | ❌ No              |
| Keyword Search   | ⚠️ Via metadata                          | ✅ Excellent       | ✅ Excellent       |
| Local Deployment | ✅ Embedded mode                         | ⚠️ Server required | ⚠️ Server required |
| No Server Needed | ✅ YES!                                  | ❌ No              | ❌ No              |
| Embedding Cost   | $0 local, ~$0.02/1M tokens via OpenAI    | None               | None               |
| Search Speed     | 50-200ms                                 | 10-50ms            | 10-50ms            |
| Best For         | Semantic discovery                       | Exact terms        | Exact terms        |

## The Killer Feature: Embedded Mode

ChromaDB has an embedded client that runs in-process - NO SERVER NEEDED!

```python
# Local (FOSS) - ChromaDB embedded in Python process
import chromadb

client = chromadb.PersistentClient(path="/path/to/chroma_data")
collection = client.get_or_create_collection("knowledge_base")

# Add documents
collection.add(
    ids=["note1", "note2"],
    documents=["AI ethics", "Neural networks"],
    metadatas=[{"type": "note"}, {"type": "spec"}]
)

# Search - NO API calls, runs locally!
results = collection.query(
    query_texts=["machine learning"],
    n_results=10
)
```

## Why

### Current Problem: Database Persistence in Cloud

In cloud deployments, `memory.db` (SQLite) doesn't persist across Docker container restarts. This means:

- Database must be rebuilt on every container restart
- Initial sync takes ~49 seconds for 500 files (after optimization in #352)
- Users experience delays on each deployment

### Search Architecture Issues

The current SQLite FTS5 implementation creates a **dual-implementation problem** for PostgreSQL migration:

- FTS5 (SQLite) uses `VIRTUAL TABLE` with `MATCH` queries
- PostgreSQL full-text search uses `TSVECTOR` with the `@@` operator
- These are fundamentally incompatible architectures
- Supporting both would require **2x search code** and **2x tests**

**Example of incompatibility:**

```python
# SQLite FTS5
"content_stems MATCH :text"

# PostgreSQL
"content_vector @@ plainto_tsquery(:text)"
```

### Search Quality Limitations

Current keyword-based FTS5 has limitations:

- No semantic understanding (search "AI" doesn't find "machine learning")
- No word relationships (search "neural networks" doesn't find "deep learning")
- Limited typo tolerance
- No relevance ranking beyond keyword matching

### Strategic Goal: PostgreSQL Migration

Moving to PostgreSQL (Neon) for cloud deployments would:

- ✅ Solve persistence issues (database survives restarts)
- ✅ Enable multi-tenant architecture
- ✅ Better performance for large datasets
- ✅ Support for cloud-native scaling

**But requires solving the search compatibility problem.**

## What

Migrate from SQLite FTS5 to **ChromaDB** for semantic vector search across all deployments.

**Key insight:** ChromaDB is **database-agnostic** - it works with both SQLite and PostgreSQL, eliminating the dual-implementation problem.
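
To make the "database-agnostic" point concrete, here is a minimal, self-contained sketch under stated assumptions. `ToyVectorIndex` is a hypothetical stand-in for a ChromaDB collection (it uses a toy bag-of-words embedding, not a real semantic model), and `resolve_entities` is a hypothetical helper, not part of basic-memory. It only illustrates the shape of the idea: the vector index returns entity ids, and the relational store resolves them with plain SQL that would read the same on SQLite or PostgreSQL apart from the driver's placeholder style.

```python
import math
import sqlite3
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase bag-of-words counts (NOT a real semantic model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorIndex:
    """Hypothetical stand-in for a vector store: it knows ids and vectors, nothing about SQL."""

    def __init__(self) -> None:
        self._vectors: dict[int, Counter] = {}

    def upsert(self, entity_id: int, document: str) -> None:
        self._vectors[entity_id] = embed(document)

    def query(self, text: str, n_results: int = 10) -> list[int]:
        query_vec = embed(text)
        scored = [(cosine(query_vec, vec), eid) for eid, vec in self._vectors.items()]
        ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: -s[0])
        return [eid for _, eid in ranked[:n_results]]


def resolve_entities(conn: sqlite3.Connection, ids: list[int]) -> list[tuple]:
    """Fetch rows by id with standard SQL, preserving the index's ranking order."""
    if not ids:
        return []
    placeholders = ", ".join("?" for _ in ids)
    rows = conn.execute(
        f"SELECT id, title FROM entities WHERE id IN ({placeholders})", ids
    ).fetchall()
    by_id = {row[0]: row for row in rows}
    return [by_id[i] for i in ids if i in by_id]


# The relational store and the search index are populated side by side
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entities (id INTEGER PRIMARY KEY, title TEXT)")
index = ToyVectorIndex()
for entity_id, title in [(1, "AI ethics guidelines"), (2, "Sprint planning notes")]:
    conn.execute("INSERT INTO entities (id, title) VALUES (?, ?)", (entity_id, title))
    index.upsert(entity_id, title)

hits = resolve_entities(conn, index.query("ethics"))
```

Because the index only hands back ids, swapping the `sqlite3` connection for a PostgreSQL one would leave `ToyVectorIndex` untouched, which is the property this spec relies on.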

### Affected Areas

- Search implementation (`src/basic_memory/repository/search_repository.py`)
- Search service (`src/basic_memory/services/search_service.py`)
- Search models (`src/basic_memory/models/search.py`)
- Database initialization (`src/basic_memory/db.py`)
- MCP search tools (`src/basic_memory/mcp/tools/search.py`)
- Dependencies (`pyproject.toml` - add ChromaDB)
- Alembic migrations (FTS5 table removal)
- Documentation

### What Changes

**Removed:**

- SQLite FTS5 virtual table
- `MATCH` query syntax
- FTS5-specific tokenization and prefix handling
- ~300 lines of FTS5 query preparation code

**Added:**

- ChromaDB persistent client (embedded mode)
- Vector embedding generation
- Semantic similarity search
- Local embedding model (`sentence-transformers`)
- Collection management for multi-project support

### What Stays the Same

- Search API interface (MCP tools, REST endpoints)
- Entity/Observation/Relation indexing workflow
- Multi-project isolation
- Search filtering by type, date, metadata
- Pagination and result formatting
- **All SQL queries for exact lookups and metadata filtering**

## Hybrid Architecture: SQL + ChromaDB

**Critical Design Decision:** ChromaDB **complements** SQL, it doesn't **replace** it.

### Why Hybrid?

ChromaDB is excellent for semantic text search but terrible for exact lookups. SQL is perfect for exact lookups and structured queries.

We use both:

```
┌─────────────────────────────────────────────────┐
│                 Search Request                  │
└─────────────────────────────────────────────────┘
                         ▼
            ┌────────────────────────┐
            │    SearchRepository    │
            │     (Smart Router)     │
            └────────────────────────┘
              ▼                  ▼
        ┌───────────┐    ┌──────────────┐
        │    SQL    │    │   ChromaDB   │
        │  Queries  │    │   Semantic   │
        └───────────┘    └──────────────┘
              ▼                  ▼
        Exact lookups      Text search
        - Permalink        - Semantic similarity
        - Pattern match    - Related concepts
        - Title exact      - Typo tolerance
        - Metadata filter  - Fuzzy matching
        - Date ranges
```

### When to Use Each

#### Use SQL For (Fast & Exact)

**Exact Permalink Lookup:**

```python
# Find by exact permalink - SQL wins
"SELECT * FROM entities WHERE permalink = 'specs/search-feature'"
# ~1ms, perfect for exact matches
# ChromaDB would be: ~50ms, wasteful
```

**Pattern Matching:**

```python
# Find all specs - SQL wins
"SELECT * FROM entities WHERE permalink GLOB 'specs/*'"
# ~5ms, perfect for wildcards
# ChromaDB doesn't support glob patterns
```

**Pure Metadata Queries:**

```python
# Find all meetings tagged "important" - SQL wins
"""SELECT * FROM entities
   WHERE json_extract(entity_metadata, '$.entity_type') = 'meeting'
   AND json_extract(entity_metadata, '$.tags') LIKE '%important%'"""
# ~5ms, structured query
# No text search needed, SQL is faster and simpler
```

**Date Filtering:**

```python
# Find recent specs - SQL wins
"""SELECT * FROM entities
   WHERE entity_type = 'spec'
   AND created_at > '2024-01-01'
   ORDER BY created_at DESC"""
# ~2ms, perfect for structured data
```

#### Use ChromaDB For (Semantic & Fuzzy)

**Semantic Content Search:**

```python
# Find notes about "neural networks" - ChromaDB wins
collection.query(query_texts=["neural networks"])
# Finds: "machine learning", "deep learning", "AI models"
# ~50-100ms, semantic understanding
# SQL FTS5 would only find exact keyword matches
```

**Text Search + Metadata:**

```python
# Find meeting notes about "project planning" tagged "important"
# (ChromaDB expects a single top-level operator, so conditions are combined with $and)
collection.query(
    query_texts=["project planning"],
    where={
        "$and": [
            {"entity_type": "meeting"},
            {"tags": {"$contains": "important"}}
        ]
    }
)
# ~100ms, semantic search with filters
# Finds: "roadmap discussion", "sprint planning", etc.
```

**Typo Tolerance:**

```python
# User types "serch feature" (typo) - ChromaDB wins
collection.query(query_texts=["serch feature"])
# Still finds: "search feature" documents
# ~50-100ms, fuzzy matching
# SQL would find nothing
```

### Performance Comparison

| Query Type | SQL | ChromaDB | Winner |
|-----------|-----|----------|--------|
| Exact permalink | 1-2ms | 50ms | ✅ SQL |
| Pattern match (specs/*) | 5-10ms | N/A | ✅ SQL |
| Pure metadata filter | 5ms | 50ms | ✅ SQL |
| Semantic text search | ❌ Can't | 50-100ms | ✅ ChromaDB |
| Text + metadata | ❌ Keywords only | 100ms | ✅ ChromaDB |
| Typo tolerance | ❌ Can't | 50ms | ✅ ChromaDB |

### Metadata/Frontmatter Handling

**Both systems support full frontmatter filtering!**

#### SQL Metadata Storage

```sql
-- Entities table stores frontmatter as JSON
CREATE TABLE entities (
    id INTEGER PRIMARY KEY,
    title TEXT,
    permalink TEXT,
    file_path TEXT,
    entity_type TEXT,
    entity_metadata JSON,  -- All frontmatter here!
    created_at DATETIME,
    ...
)

-- Query frontmatter fields
SELECT * FROM entities
WHERE json_extract(entity_metadata, '$.entity_type') = 'meeting'
  AND json_extract(entity_metadata, '$.tags') LIKE '%important%'
  AND json_extract(entity_metadata, '$.status') = 'completed'
```

#### ChromaDB Metadata Storage

```python
# When indexing, store ALL frontmatter as metadata
class ChromaSearchBackend:
    async def index_entity(self, entity: Entity):
        """Index with complete frontmatter metadata."""

        # Extract ALL frontmatter fields
        metadata = {
            "entity_id": entity.id,
            "project_id": entity.project_id,
            "permalink": entity.permalink,
            "file_path": entity.file_path,
            "entity_type": entity.entity_type,
            "type": "entity",
            # ALL frontmatter tags
            "tags": entity.entity_metadata.get("tags", []),
            # Custom frontmatter fields
            "status": entity.entity_metadata.get("status"),
            "priority": entity.entity_metadata.get("priority"),
            # Spread any other custom fields
            **{k: v for k, v in entity.entity_metadata.items()
               if k not in ["tags", "entity_type"]}
        }

        self.collection.upsert(
            ids=[f"entity_{entity.id}_{entity.project_id}"],
            documents=[self._format_document(entity)],
            metadatas=[metadata]  # Full frontmatter!
        )
```

#### ChromaDB Metadata Queries

ChromaDB supports rich filtering:

```python
# Simple filter - single field
collection.query(
    query_texts=["project planning"],
    where={"entity_type": "meeting"}
)

# Multiple conditions (AND)
collection.query(
    query_texts=["architecture decisions"],
    where={
        "$and": [
            {"entity_type": "spec"},
            {"tags": {"$contains": "important"}}
        ]
    }
)

# Complex filters with operators
collection.query(
    query_texts=["machine learning"],
    where={
        "$and": [
            {"entity_type": {"$in": ["note", "spec"]}},
            {"tags": {"$contains": "AI"}},
            {"created_at": {"$gt": "2024-01-01"}},
            {"status": "in-progress"}
        ]
    }
)

# Multiple tags (all must match)
collection.query(
    query_texts=["cloud architecture"],
    where={
        "$and": [
            {"tags": {"$contains": "architecture"}},
            {"tags": {"$contains": "cloud"}}
        ]
    }
)
```

### Smart Routing Implementation

```python
class SearchRepository:
    def __init__(
        self,
        session_maker: async_sessionmaker[AsyncSession],
        project_id: int,
        chroma_backend: ChromaSearchBackend
    ):
        self.session_maker = session_maker  # Keep SQL!
        self.chroma = chroma_backend
        self.project_id = project_id

    async def search(
        self,
        search_text: Optional[str] = None,
        permalink: Optional[str] = None,
        permalink_match: Optional[str] = None,
        title: Optional[str] = None,
        types: Optional[List[str]] = None,
        tags: Optional[List[str]] = None,
        after_date: Optional[datetime] = None,
        custom_metadata: Optional[dict] = None,
        limit: int = 10,
        offset: int = 0,
    ) -> List[SearchIndexRow]:
        """Smart routing between SQL and ChromaDB."""

        # ==========================================
        # Route 1: Exact Lookups → SQL (1-5ms)
        # ==========================================
        if permalink:
            # Exact permalink: "specs/search-feature"
            return await self._sql_permalink_lookup(permalink)

        if permalink_match:
            # Pattern match: "specs/*"
            return await self._sql_pattern_match(permalink_match)

        if title and not search_text:
            # Exact title lookup (no semantic search needed)
            return await self._sql_title_match(title)

        # ==========================================
        # Route 2: Pure Metadata → SQL (5-10ms)
        # ==========================================
        # No text search, just filtering by metadata
        if not search_text and (types or tags or after_date or custom_metadata):
            return await self._sql_metadata_filter(
                types=types,
                tags=tags,
                after_date=after_date,
                custom_metadata=custom_metadata,
                limit=limit,
                offset=offset
            )

        # ==========================================
        # Route 3: Text Search → ChromaDB (50-100ms)
        # ==========================================
        if search_text:
            # Build ChromaDB metadata filters
            where_filters = self._build_chroma_filters(
                types=types,
                tags=tags,
                after_date=after_date,
                custom_metadata=custom_metadata
            )

            # Semantic search with metadata filtering
            # (keyword matches the `filters` parameter of ChromaSearchBackend.search)
            return await self.chroma.search(
                query_text=search_text,
                project_id=self.project_id,
                filters=where_filters,
                limit=limit
            )

        # ==========================================
        # Route 4: List All → SQL (2-5ms)
        # ==========================================
        return await self._sql_list_entities(
            limit=limit,
            offset=offset
        )

    def _build_chroma_filters(
        self,
        types: Optional[List[str]] = None,
        tags: Optional[List[str]] = None,
        after_date: Optional[datetime] = None,
        custom_metadata: Optional[dict] = None
    ) -> dict:
        """Build ChromaDB where clause from filters."""
        # Collect one condition per filter and combine them at the end, so no
        # keys are added to a dict that has already been wrapped in "$and"
        conditions: List[dict] = [{"project_id": self.project_id}]

        # Type filtering
        if types:
            if len(types) == 1:
                conditions.append({"entity_type": types[0]})
            else:
                conditions.append({"entity_type": {"$in": types}})

        # Tag filtering (array contains) - all tags must match
        if tags:
            conditions.extend({"tags": {"$contains": tag}} for tag in tags)

        # Date filtering
        if after_date:
            conditions.append({"created_at": {"$gt": after_date.isoformat()}})

        # Custom frontmatter fields
        if custom_metadata:
            conditions.extend({key: value} for key, value in custom_metadata.items())

        if len(conditions) == 1:
            return conditions[0]
        return {"$and": conditions}

    async def _sql_metadata_filter(
        self,
        types: Optional[List[str]] = None,
        tags: Optional[List[str]] = None,
        after_date: Optional[datetime] = None,
        custom_metadata: Optional[dict] = None,
        limit: int = 10,
        offset: int = 0
    ) -> List[SearchIndexRow]:
        """Pure metadata queries using SQL."""
        conditions = ["project_id = :project_id"]
        params = {"project_id": self.project_id}

        if types:
            # Bind each type as a parameter instead of interpolating it into the SQL
            type_params = []
            for i, entity_type in enumerate(types):
                param_name = f"type_{i}"
                type_params.append(f":{param_name}")
                params[param_name] = entity_type
            conditions.append(f"entity_type IN ({', '.join(type_params)})")

        if tags:
            # Check each tag
            for i, tag in enumerate(tags):
                param_name = f"tag_{i}"
                conditions.append(
                    f"json_extract(entity_metadata, '$.tags') LIKE :{param_name}"
                )
                params[param_name] = f"%{tag}%"

        if after_date:
            conditions.append("created_at > :after_date")
            params["after_date"] = after_date

        if custom_metadata:
            # Note: `key` ends up inside the JSON path, so it must be validated upstream
            for key, value in custom_metadata.items():
                param_name = f"meta_{key}"
                conditions.append(
                    f"json_extract(entity_metadata, '$.{key}') = :{param_name}"
                )
                params[param_name] = value

        where = " AND ".join(conditions)
        sql = f"""
            SELECT * FROM entities
            WHERE {where}
            ORDER BY created_at DESC
            LIMIT :limit OFFSET :offset
        """
        params["limit"] = limit
        params["offset"] = offset

        async with db.scoped_session(self.session_maker) as session:
            result = await session.execute(text(sql), params)
            return self._format_sql_results(result)
```

### Real-World Examples

#### Example 1: Pure Metadata Query (No Text)

```python
# "Find all meetings tagged 'important'"
results = await search_repo.search(
    types=["meeting"],
    tags=["important"]
)

# Routing: → SQL (~5ms)
# SQL: SELECT * FROM entities
#      WHERE entity_type = 'meeting'
#      AND json_extract(entity_metadata, '$.tags') LIKE '%important%'
```

#### Example 2: Semantic Search (No Metadata)

```python
# "Find notes about neural networks"
results = await search_repo.search(
    search_text="neural networks"
)

# Routing: → ChromaDB (~80ms)
# Finds: "machine learning", "deep learning", "AI models", etc.
```

#### Example 3: Semantic + Metadata

```python
# "Find meeting notes about 'project planning' tagged 'important'"
results = await search_repo.search(
    search_text="project planning",
    types=["meeting"],
    tags=["important"]
)

# Routing: → ChromaDB with filters (~100ms)
# ChromaDB: query_texts=["project planning"]
#           where={"$and": [{"project_id": ...},
#                           {"entity_type": "meeting"},
#                           {"tags": {"$contains": "important"}}]}
# Finds: "roadmap discussion", "sprint planning", etc.
```

#### Example 4: Complex Frontmatter Query

```python
# "Find in-progress specs with multiple tags, recent"
results = await search_repo.search(
    types=["spec"],
    tags=["architecture", "cloud"],
    after_date=datetime(2024, 1, 1),
    custom_metadata={"status": "in-progress"}
)

# Routing: → SQL (~10ms)
# No text search, pure structured query - SQL is faster
```

#### Example 5: Semantic + Complex Metadata

```python
# "Find high-priority notes about 'authentication' that are in-progress"
results = await search_repo.search(
    search_text="authentication",
    custom_metadata={"status": "in-progress", "priority": "high"}
)

# Routing: → ChromaDB with metadata filters (~100ms)
# Semantic search for "authentication" concept
# Filters by status and priority in metadata
```

#### Example 6: Exact Permalink

```python
# "Show me specs/search-feature"
results = await search_repo.search(
    permalink="specs/search-feature"
)

# Routing: → SQL (~1ms)
# SQL: SELECT * FROM entities WHERE permalink = 'specs/search-feature'
```

#### Example 7: Pattern Match

```python
# "Show me all specs"
results = await search_repo.search(
    permalink_match="specs/*"
)

# Routing: → SQL (~5ms)
# SQL: SELECT * FROM entities WHERE permalink GLOB 'specs/*'
```

### What We Remove vs Keep

**REMOVE (FTS5-specific):**

- ❌ `CREATE VIRTUAL TABLE search_index USING fts5(...)`
- ❌ `MATCH` operator queries
- ❌ FTS5 tokenization configuration
- ❌ ~300 lines of FTS5 query preparation code
- ❌ Trigram generation and prefix handling

**KEEP (Standard SQL):**

- ✅ `SELECT * FROM entities WHERE permalink = :permalink`
- ✅ `SELECT * FROM entities WHERE permalink GLOB :pattern`
- ✅ `SELECT * FROM entities WHERE title LIKE :title`
- ✅ `SELECT * FROM entities WHERE json_extract(entity_metadata, ...)
= :value`
- ✅ All date filtering, pagination, sorting
- ✅ Entity table structure and indexes

**ADD (ChromaDB):**

- ✅ ChromaDB persistent client (embedded)
- ✅ Semantic vector search
- ✅ Metadata filtering in ChromaDB
- ✅ Smart routing logic

## How (High Level)

### Architecture Overview

```
┌─────────────────────────────────────────────────────────────┐
│ FOSS Deployment (Local)                                     │
├─────────────────────────────────────────────────────────────┤
│ SQLite (data) + ChromaDB embedded (search)                  │
│ - No external services                                      │
│ - Local embedding model (sentence-transformers)             │
│ - Persists in ~/.basic-memory/chroma_data/                  │
└─────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────┐
│ Cloud Deployment (Multi-tenant)                             │
├─────────────────────────────────────────────────────────────┤
│ PostgreSQL/Neon (data) + ChromaDB server (search)           │
│ - Neon serverless Postgres for persistence                  │
│ - ChromaDB server in Docker container                       │
│ - Optional: OpenAI embeddings for better quality            │
└─────────────────────────────────────────────────────────────┘
```

### Phase 1: ChromaDB Integration (2-3 days)

#### 1. Add ChromaDB Dependency

```toml
# pyproject.toml
dependencies = [
    "chromadb>=0.4.0",
    "sentence-transformers>=2.2.0",  # Local embeddings
]
```

#### 2. Create ChromaSearchBackend

```python
# src/basic_memory/search/chroma_backend.py
from pathlib import Path
from typing import List, Optional

from chromadb import PersistentClient
from chromadb.utils import embedding_functions


class ChromaSearchBackend:
    def __init__(
        self,
        persist_directory: Path,
        collection_name: str = "knowledge_base",
        embedding_model: str = "all-MiniLM-L6-v2"
    ):
        """Initialize ChromaDB with local embeddings."""
        self.client = PersistentClient(path=str(persist_directory))

        # Use local sentence-transformers model (no API costs)
        self.embed_fn = embedding_functions.SentenceTransformerEmbeddingFunction(
            model_name=embedding_model
        )

        self.collection = self.client.get_or_create_collection(
            name=collection_name,
            embedding_function=self.embed_fn,
            metadata={"hnsw:space": "cosine"}  # Similarity metric
        )

    async def index_entity(self, entity: Entity):
        """Index entity with automatic embeddings."""
        # Combine title and content for semantic search
        document = self._format_document(entity)

        self.collection.upsert(
            ids=[f"entity_{entity.id}_{entity.project_id}"],
            documents=[document],
            metadatas=[{
                "entity_id": entity.id,
                "project_id": entity.project_id,
                "permalink": entity.permalink,
                "file_path": entity.file_path,
                "entity_type": entity.entity_type,
                "type": "entity",
            }]
        )

    async def search(
        self,
        query_text: str,
        project_id: int,
        limit: int = 10,
        filters: Optional[dict] = None
    ) -> List[SearchResult]:
        """Semantic search with metadata filtering."""
        # Always scope to the project; extra filters are combined with $and
        # because a where clause takes a single top-level operator
        where = {"project_id": project_id}
        if filters:
            where = {"$and": [where, filters]}

        results = self.collection.query(
            query_texts=[query_text],
            n_results=limit,
            where=where
        )

        return self._format_results(results)
```

#### 3. Update SearchRepository

```python
# src/basic_memory/repository/search_repository.py
class SearchRepository:
    def __init__(
        self,
        session_maker: async_sessionmaker[AsyncSession],
        project_id: int,
        chroma_backend: ChromaSearchBackend
    ):
        self.session_maker = session_maker
        self.project_id = project_id
        self.chroma = chroma_backend

    async def search(
        self,
        search_text: Optional[str] = None,
        permalink: Optional[str] = None,
        # ... other filters
    ) -> List[SearchIndexRow]:
        """Search using ChromaDB for text, SQL for exact lookups."""

        # For exact permalink/pattern matches, use SQL
        if permalink or permalink_match:
            return await self._sql_exact_search(...)

        # For text search, use ChromaDB semantic search
        if search_text:
            results = await self.chroma.search(
                query_text=search_text,
                project_id=self.project_id,
                limit=limit,
                filters=self._build_filters(types, after_date, ...)
            )
            return results

        # Fallback to listing all
        return await self._list_entities(...)
```

#### 4. Update SearchService

```python
# src/basic_memory/services/search_service.py
class SearchService:
    def __init__(
        self,
        search_repository: SearchRepository,
        entity_repository: EntityRepository,
        file_service: FileService,
        chroma_backend: ChromaSearchBackend,
    ):
        self.repository = search_repository
        self.entity_repository = entity_repository
        self.file_service = file_service
        self.chroma = chroma_backend

    async def index_entity(self, entity: Entity):
        """Index entity in ChromaDB."""
        if entity.is_markdown:
            await self._index_entity_markdown(entity)
        else:
            await self._index_entity_file(entity)

    async def _index_entity_markdown(self, entity: Entity):
        """Index markdown entity with full content."""
        # Index entity
        await self.chroma.index_entity(entity)

        # Index observations (as separate documents)
        for obs in entity.observations:
            await self.chroma.index_observation(obs, entity)

        # Index relations (metadata only)
        for rel in entity.outgoing_relations:
            await self.chroma.index_relation(rel, entity)
```

### Phase 2: PostgreSQL Support (1 day)

#### 1. Add PostgreSQL Database Type

```python
# src/basic_memory/db.py
from enum import Enum, auto


class DatabaseType(Enum):
    MEMORY = auto()
    FILESYSTEM = auto()
    POSTGRESQL = auto()  # NEW

    @classmethod
    def get_db_url(cls, db_path_or_url: str, db_type: "DatabaseType") -> str:
        if db_type == cls.POSTGRESQL:
            return db_path_or_url  # Neon connection string
        elif db_type == cls.MEMORY:
            return "sqlite+aiosqlite://"
        return f"sqlite+aiosqlite:///{db_path_or_url}"
```

#### 2. Update Connection Handling

```python
def _create_engine_and_session(...):
    db_url = DatabaseType.get_db_url(db_path_or_url, db_type)

    if db_type == DatabaseType.POSTGRESQL:
        # Use asyncpg driver for Postgres
        engine = create_async_engine(
            db_url,
            pool_size=10,
            max_overflow=20,
            pool_pre_ping=True,  # Health checks
        )
    else:
        # SQLite configuration
        engine = create_async_engine(db_url, connect_args=connect_args)

        # Only configure SQLite-specific settings for SQLite
        if db_type != DatabaseType.MEMORY:
            @event.listens_for(engine.sync_engine, "connect")
            def enable_wal_mode(dbapi_conn, connection_record):
                _configure_sqlite_connection(dbapi_conn, enable_wal=True)

    return engine, async_sessionmaker(engine, expire_on_commit=False)
```

#### 3. Remove SQLite-Specific Code

```python
# Remove from scoped_session context manager:
# await session.execute(text("PRAGMA foreign_keys=ON"))  # DELETE

# PostgreSQL handles foreign keys by default
```

### Phase 3: Migration & Testing (1-2 days)

#### 1. Create Migration Script

```python
# scripts/migrate_to_chromadb.py
async def migrate_fts5_to_chromadb():
    """One-time migration from FTS5 to ChromaDB."""
    # 1. Read all entities from database
    entities = await entity_repository.find_all()

    # 2. Index in ChromaDB
    for entity in entities:
        await search_service.index_entity(entity)

    # 3. Drop FTS5 table (Alembic migration)
    await session.execute(text("DROP TABLE IF EXISTS search_index"))
```

#### 2. Update Tests

- Replace FTS5 test fixtures with ChromaDB fixtures
- Test semantic search quality
- Test multi-project isolation in ChromaDB
- Benchmark performance vs FTS5

#### 3. Documentation Updates

- Update search documentation
- Add ChromaDB configuration guide
- Document embedding model options
- PostgreSQL deployment guide

### Configuration

```python
# config.py
class BasicMemoryConfig:
    # Database
    database_type: DatabaseType = DatabaseType.FILESYSTEM
    database_path: Path = Path.home() / ".basic-memory" / "memory.db"
    database_url: Optional[str] = None  # For Postgres: postgresql+asyncpg://...

    # Search
    chroma_persist_directory: Path = Path.home() / ".basic-memory" / "chroma_data"
    embedding_model: str = "all-MiniLM-L6-v2"  # Local model
    embedding_provider: str = "local"  # or "openai"
    openai_api_key: Optional[str] = None  # For cloud deployments
```

### Deployment Configurations

#### Local (FOSS)

```yaml
# Default configuration
database_type: FILESYSTEM
database_path: ~/.basic-memory/memory.db
chroma_persist_directory: ~/.basic-memory/chroma_data
embedding_model: all-MiniLM-L6-v2
embedding_provider: local
```

#### Cloud (Docker Compose)

```yaml
services:
  postgres:
    image: postgres:15
    environment:
      POSTGRES_DB: basic_memory
      POSTGRES_PASSWORD: ${DB_PASSWORD}

  chromadb:
    image: chromadb/chroma:latest
    volumes:
      - chroma_data:/chroma/chroma
    environment:
      ALLOW_RESET: "true"

  app:
    environment:
      DATABASE_TYPE: POSTGRESQL
      # The +asyncpg suffix selects the async driver used by create_async_engine
      DATABASE_URL: postgresql+asyncpg://postgres:${DB_PASSWORD}@postgres/basic_memory
      CHROMA_HOST: chromadb
      CHROMA_PORT: 8000
      EMBEDDING_PROVIDER: local  # or openai

# Named volume referenced by the chromadb service
volumes:
  chroma_data:
```

## How to Evaluate

### Success Criteria

#### Functional Requirements

- ✅ Semantic search finds related concepts (e.g., "AI" finds "machine learning")
- ✅ Exact permalink/pattern matches work (e.g., `specs/*`)
- ✅ Multi-project isolation maintained
- ✅ All existing search filters work (type, date, metadata)
- ✅ MCP tools continue to work without changes
- ✅ Works with both SQLite and PostgreSQL

#### Performance Requirements

- ✅ Search latency < 200ms for 1000 documents (local embedding)
- ✅ Indexing time comparable to FTS5 (~10 files/sec)
- ✅ Initial sync time not significantly worse than current
- ✅ Memory footprint < 1GB for local deployments

#### Quality Requirements

- ✅ Better search relevance than FTS5 keyword matching
- ✅ Handles typos and word variations
- ✅ Finds semantically similar content

#### Deployment Requirements

- ✅ FOSS: Works out-of-box with no external services
- ✅ Cloud: Integrates with PostgreSQL (Neon)
- ✅ No breaking changes to MCP API
- ✅ Migration script for existing users

### Testing Procedure

#### 1. Unit Tests

```bash
# Test ChromaDB backend
pytest tests/test_chroma_backend.py

# Test search repository with ChromaDB
pytest tests/test_search_repository.py

# Test search service
pytest tests/test_search_service.py
```

#### 2. Integration Tests

```bash
# Test full search workflow
pytest test-int/test_search_integration.py

# Test with PostgreSQL
DATABASE_TYPE=POSTGRESQL pytest test-int/
```

#### 3. Semantic Search Quality Tests

```python
# Test semantic similarity: each query should find the listed concepts
EXPECTED_MATCHES = {
    "machine learning": ["neural networks", "deep learning", "AI algorithms"],
    "software architecture": ["system design", "design patterns", "microservices"],
}
```

#### 4. Performance Benchmarks

```bash
# Run search benchmarks
pytest test-int/test_search_performance.py -v

# Measure:
# - Search latency (should be < 200ms)
# - Indexing throughput (should be ~10 files/sec)
# - Memory usage (should be < 1GB)
```

#### 5. Migration Testing

```bash
# Test migration from FTS5 to ChromaDB
python scripts/migrate_to_chromadb.py

# Verify all entities indexed
# Verify search results quality
# Verify no data loss
```

### Metrics

**Search Quality:**

- Semantic relevance score (manual evaluation)
- Precision/recall for common queries
- User satisfaction (qualitative)

**Performance:**

- Average search latency (ms)
- P95/P99 search latency
- Indexing throughput (files/sec)
- Memory usage (MB)

**Deployment:**

- Local deployment success rate
- Cloud deployment success rate
- Migration success rate

## Implementation Checklist

### Phase 1: ChromaDB Integration

- [ ] Add ChromaDB and sentence-transformers dependencies
- [ ] Create ChromaSearchBackend class
- [ ] Update SearchRepository to use ChromaDB
- [ ] Update SearchService indexing methods
- [ ] Remove FTS5 table creation code
- [ ] Update search query logic
- [ ] Add ChromaDB configuration to BasicMemoryConfig

### Phase 2: PostgreSQL Support

- [ ] Add DatabaseType.POSTGRESQL enum
- [ ] Update get_db_url() for Postgres connection strings
- [ ] Add asyncpg dependency
- [ ] Update engine creation for Postgres
- [ ] Remove SQLite-specific PRAGMA statements
- [ ] Test with Neon database

### Phase 3: Testing & Migration

- [ ] Write unit tests for ChromaSearchBackend
- [ ] Update search integration tests
- [ ] Add semantic search quality tests
- [ ] Create performance benchmarks
- [ ] Write migration script from FTS5
- [ ] Test migration with existing data
- [ ] Update documentation

### Phase 4: Deployment

- [ ] Update docker-compose.yml for cloud
- [ ] Document local FOSS deployment
- [ ] Document cloud PostgreSQL deployment
- [ ] Create migration guide for users
- [ ] Update MCP tool documentation

## Notes

### Embedding Model Trade-offs

**Local Model: `all-MiniLM-L6-v2`**

- Size: 80MB download
- Speed: ~50ms embedding time
- Dimensions: 384
- Cost: $0
- Quality: Good for general knowledge
- Best for: FOSS deployments

**OpenAI: `text-embedding-3-small`**

- Speed: ~100-200ms (API call)
- Dimensions: 1536
- Cost: ~$0.02 per 1M tokens (~$0.01 per 1000 notes)
- Quality: Excellent
- Best for: Cloud deployments with budget

### ChromaDB Storage

ChromaDB stores data in:

```
~/.basic-memory/chroma_data/
├── chroma.sqlite3  # Metadata
├── index/          # HNSW indexes
└── collections/    # Vector data
```

Typical sizes:

- 100 notes: ~5MB
- 1000 notes: ~50MB
- 10000 notes: ~500MB

### Why Not Keep FTS5?

**Considered:** Hybrid approach (FTS5 for SQLite + tsvector for Postgres)

**Rejected because:**

- 2x the code to maintain
- 2x the tests to write
- 2x the bugs to fix
- Inconsistent search behavior between deployments
- ChromaDB provides better search quality anyway

**ChromaDB wins:**

- One implementation for both databases
- Better search quality (semantic!)
- Database-agnostic architecture
- Embedded mode for FOSS (no servers needed)

## Implementation

### Proposed Architecture

#### Option 1: ChromaDB Only (Simplest)

```python
import chromadb
from chromadb.utils import embedding_functions


class ChromaSearchBackend:
    def __init__(self, path: str, embedding_model: str = "all-MiniLM-L6-v2"):
        # For local: embedded client (no server!)
        self.client = chromadb.PersistentClient(path=path)

        # Use local embedding model (no API costs!)
        self.embed_fn = embedding_functions.SentenceTransformerEmbeddingFunction(
            model_name=embedding_model
        )

        self.collection = self.client.get_or_create_collection(
            name="knowledge_base",
            embedding_function=self.embed_fn
        )

    async def index_entity(self, entity: Entity):
        # ChromaDB handles embeddings automatically!
        self.collection.upsert(
            ids=[str(entity.id)],
            documents=[f"{entity.title}\n{entity.content}"],
            metadatas=[{
                "permalink": entity.permalink,
                "type": entity.entity_type,
                "file_path": entity.file_path
            }]
        )

    async def search(self, query: str, filters: dict = None):
        # Semantic search with optional metadata filters
        results = self.collection.query(
            query_texts=[query],
            n_results=10,
            where=filters  # e.g., {"type": "note"}
        )
        return results
```

Deployment:

- Local (FOSS): ChromaDB embedded, local embedding model, NO servers
- Cloud: ChromaDB server OR still embedded (it's just a Python lib!)

#### Option 2: Hybrid FTS + ChromaDB (Best UX)

```python
class HybridSearchBackend:
    def __init__(self, chroma_path: str):
        self.fts = SQLiteFTS5Backend()  # Fast keyword search
        self.chroma = ChromaSearchBackend(chroma_path)  # Semantic search

    async def search(self, query: str, search_type: str = "auto"):
        if search_type == "exact":
            # User wants exact match: "specs/search-feature"
            return await self.fts.search(query)

        elif search_type == "semantic":
            # User wants related concepts
            return await self.chroma.search(query)

        else:  # "auto"
            # Check if query looks like exact match
            if "/" in query or query.startswith('"'):
                return await self.fts.search(query)

            # Otherwise use semantic search
            return await self.chroma.search(query)
```

### Embedding Options

#### Option A: Local Model (FREE, FOSS-friendly)

```python
# Uses sentence-transformers (runs locally)
# Model: ~100MB download
# Speed: ~50-100ms for embedding
# Cost: $0
from chromadb.utils import embedding_functions

embed_fn = embedding_functions.SentenceTransformerEmbeddingFunction(
    model_name="all-MiniLM-L6-v2"  # Fast, accurate, free
)
```

#### Option B: OpenAI Embeddings (Cloud only)

```python
# For cloud users who want best quality
# Model: text-embedding-3-small
# Speed: ~100-200ms via API
# Cost: ~$0.02 per 1M tokens (~$0.01 per 1000 notes)
embed_fn = embedding_functions.OpenAIEmbeddingFunction(
    api_key="...",
    model_name="text-embedding-3-small"
)
```

### Performance Comparison

```
Local embedding model: all-MiniLM-L6-v2
Embedding time: ~50ms per note
Search time: ~100ms for 1000 notes
Memory: ~500MB (model + ChromaDB)
Cost: $0
Quality: Good (384 dimensions)

OpenAI embeddings: text-embedding-3-small
Embedding time: ~100-200ms per note (API call)
Search time: ~50ms for 1000 notes
Cost: ~$0.01 per 1000 notes
Quality: Excellent (1536 dimensions)
```

### My Recommendation: ChromaDB with Local Embeddings

Here's the plan:

#### Phase 1: Local ChromaDB (1-2 days)

```
# FOSS version
- SQLite for data persistence
- ChromaDB embedded for semantic search
- Local embedding model (no API costs)
- NO external services required
```

Benefits:

- ✅ Same deployment as current (just Python package)
- ✅ Semantic search for better UX
- ✅ Free embeddings with local model
- ✅ No servers needed

#### Phase 2: Postgres + ChromaDB Cloud (1-2 days)

```
# Cloud version
- Postgres for data persistence
- ChromaDB server for semantic search
- OpenAI embeddings (higher quality)
- OR keep local embeddings (cheaper)
```

#### Phase 3: Hybrid Search (optional, 1 day)

```
# Add FTS for exact matches alongside ChromaDB
- Quick keyword search when needed
- Semantic search for exploration
- Best of both worlds
```

### Code Estimate

Just ChromaDB (replacing FTS5):

- Remove FTS5 code: 2 hours
- Add ChromaDB backend: 4 hours
- Update search service: 2 hours
- Testing: 4 hours
- Total: 1.5 days

ChromaDB + Postgres migration:

- Add Postgres support: 4 hours
- Test with Neon: 2 hours
- Total: +0.75 days

Grand total: 2-3 days for complete migration

### The Kicker

ChromaDB solves BOTH problems:

1. ✅ Works with SQLite AND Postgres (it's separate!)
2. ✅ No server needed for local (embedded mode)
3. ✅ Better search than FTS5 (semantic!)
4. ✅ One implementation for both deployments

Want me to prototype this? I can show you:

1. ChromaDB embedded with local embeddings
2. Example searches showing semantic matching
3. Performance benchmarks
4. Migration from FTS5

## Observations

- [problem] SQLite FTS5 and PostgreSQL tsvector are incompatible architectures requiring dual implementation #database-compatibility
- [problem] Cloud deployments lose database on container restart requiring full re-sync #persistence
- [solution] ChromaDB provides database-agnostic semantic search eliminating dual implementation #architecture
- [advantage] Semantic search finds related concepts beyond keyword matching improving UX #search-quality
- [deployment] Embedded ChromaDB requires no external services for FOSS #simplicity
- [migration] Moving to PostgreSQL solves cloud persistence issues #cloud-architecture
- [performance] Local embedding models provide good quality at zero cost #cost-optimization
- [trade-off] Embedding generation adds ~50ms latency vs instant FTS5 indexing #performance
- [benefit] Single search codebase reduces maintenance burden and test coverage needs #maintainability

## Prior Art / References

### Community Fork: manuelbliemel/basic-memory (feature/vector-search)

**Repository**: https://github.com/manuelbliemel/basic-memory/tree/feature/vector-search

**Key Implementation Details**:

**Vector Database**: ChromaDB (same as our approach!)

**Embedding Models**:

- Local: `all-MiniLM-L6-v2` (default, 384 dims) - same model we planned
- Also supports: `all-mpnet-base-v2`, `paraphrase-MiniLM-L6-v2`, `multi-qa-MiniLM-L6-cos-v1`
- OpenAI: `text-embedding-ada-002`, `text-embedding-3-small`, `text-embedding-3-large`

**Chunking Strategy** (interesting - we didn't consider this):

- Chunk Size: 500 characters
- Chunk Overlap: 50 characters
- Breaks documents into smaller pieces for better semantic search

**Search Strategies**:

1. `fuzzy_only` (default) - FTS5 only
2. `vector_only` - ChromaDB only
3. `hybrid` (recommended) - Both FTS5 + ChromaDB
4. `fuzzy_primary` - FTS5 first, ChromaDB fallback
5.
`vector_primary` - ChromaDB first, FTS5 fallback **Configuration**: - Similarity Threshold: 0.1 - Max Results: 5 - Storage: `~/.basic-memory/chroma/` - Config: `~/.basic-memory/config.json` **Key Differences from Our Approach**: | Aspect | Their Approach | Our Approach | |--------|---------------|--------------| | FTS5 | Keep FTS5 + add ChromaDB | Remove FTS5, use SQL for exact lookups | | Search Strategy | 5 configurable strategies | Smart routing (automatic) | | Document Processing | Chunk into 500-char pieces | Index full documents | | Hybrid Mode | Run both, merge, dedupe | Route to best backend | | Configuration | User-configurable strategy | Automatic based on query type | **What We Can Learn**: 1. **Chunking**: Breaking documents into 500-character chunks with 50-char overlap may improve semantic search quality for long documents - Pro: Better granularity for semantic matching - Con: More vectors to store and search - Consider: Optional chunking for large documents (>2000 chars) 2. **Configurable Strategies**: Allowing users to choose search strategy provides flexibility - Pro: Power users can tune behavior - Con: More complexity, most users won't configure - Consider: Default to smart routing, allow override via config 3. **Similarity Threshold**: They use 0.1 as default - Consider: Benchmark different thresholds for quality 4. **Storage Location**: `~/.basic-memory/chroma/` matches our planned `chroma_data/` approach **Potential Collaboration**: - Their implementation is nearly complete as a fork - Could potentially merge their work or use as reference implementation - Their chunking strategy could be valuable addition to our approach ## Relations - implements [[SPEC-11 Basic Memory API Performance Optimization]] - relates_to [[Performance Optimizations Documentation]] - enables [[PostgreSQL Migration]] - improves_on [[SQLite FTS5 Search]] - references [[manuelbliemel/basic-memory feature/vector-search fork]]
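
As a rough illustration of the chunking idea noted under Prior Art (500-character chunks, 50-character overlap, applied only to documents over 2000 characters), here is a minimal sketch. The function names `chunk_text` and `maybe_chunk` are hypothetical and are not taken from the fork or from Basic Memory; the fork's actual splitting logic may differ (for example, it may split on word or sentence boundaries).

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into character chunks; consecutive chunks share `overlap` characters.

    Hypothetical helper: sizes default to the values listed for the fork.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    step = chunk_size - overlap  # how far each new chunk advances
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this chunk already reaches the end of the text
    return chunks


def maybe_chunk(text: str, threshold: int = 2000) -> list[str]:
    """Index short documents whole; chunk only those longer than `threshold` chars."""
    if len(text) <= threshold:
        return [text]
    return chunk_text(text)
```

Each chunk would then be upserted as its own ChromaDB document, for instance with an ID derived from the entity ID plus the chunk index (an assumed scheme, not something this spec defines), so that search hits can be mapped back to the parent note.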
